Meaningless comparisons lead to false optimism in medical machine learning

Authors

Orianna DeMasi

Konrad P. Körding

Benjamin Recht

Abstract

A new trend in medicine is the use of algorithms to analyze big datasets, e.g. using everything your phone measures about you for diagnostics or monitoring. However, these algorithms are commonly compared against weak baselines, which may contribute to excessive optimism. To assess how well an algorithm works, scientists typically ask how well its output correlates with medically assigned scores. Here we perform a meta-analysis to quantify how the literature evaluates algorithms for monitoring mental wellbeing. We find that the bulk of the literature (∼77%) uses meaningless comparisons that ignore the patient's baseline state. For example, an algorithm that uses phone data to diagnose mood disorders would be useful. However, it is possible to explain over 80% of the variance of some mood measures in the population by simply guessing that each patient has their own average mood, i.e. the patient-specific baseline. Thus, an algorithm that just predicts that our mood is like it usually is can explain the majority of the variance but is, obviously, entirely useless. Comparing to the wrong (population) baseline has a massive effect on the perceived quality of algorithms and produces baseless optimism in the field. To solve this problem we propose "user lift", a measure that reduces these systematic errors in the evaluation of personalized medical monitoring.
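The baseline effect described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the mood scores are invented toy data, the personal baseline is computed in-sample for simplicity, and `lift_over_personal_baseline` is a hypothetical error-reduction measure, not necessarily the "user lift" defined in the paper.

```python
from statistics import mean


def flatten(patients):
    """Concatenate all patients' scores into one list (dict insertion order)."""
    return [s for scores in patients.values() for s in scores]


def personal_baseline(patients):
    """Predict, for every observation, that patient's own mean score."""
    preds = []
    for scores in patients.values():
        preds.extend([mean(scores)] * len(scores))
    return preds


def sum_sq_err(y_true, y_pred):
    """Sum of squared differences between observations and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Variance explained relative to the population-mean baseline."""
    grand_mean = mean(y_true)
    ss_tot = sum((t - grand_mean) ** 2 for t in y_true)
    return 1.0 - sum_sq_err(y_true, y_pred) / ss_tot


def lift_over_personal_baseline(y_true, y_pred, baseline_pred):
    """Hypothetical measure: error reduction relative to the personal baseline.

    0.0 means no better than predicting each patient's usual mood.
    """
    return 1.0 - sum_sq_err(y_true, y_pred) / sum_sq_err(y_true, baseline_pred)


# Invented toy mood scores for three hypothetical patients.
patients = {
    "a": [2.0, 3.0, 2.0, 3.0],
    "b": [7.0, 8.0, 7.0, 8.0],
    "c": [5.0, 4.0, 5.0, 4.0],
}

y = flatten(patients)
baseline = personal_baseline(patients)

# The "useless" predictor still scores well against the population mean.
baseline_r2 = r_squared(y, baseline)

# Measured against the personal baseline, the same predictor adds nothing.
baseline_lift = lift_over_personal_baseline(y, baseline, baseline)

print(f"R^2 of personal-mean predictor vs population mean: {baseline_r2:.3f}")
print(f"Lift of personal-mean predictor over itself: {baseline_lift:.3f}")
```

In this toy data most of the variance lies between patients rather than within them, so a predictor that only repeats each patient's usual mood looks strong against the population mean while offering no improvement over the patient-specific baseline.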

Similar resources

Optimism in Active Learning

Active learning is the problem of interactively constructing the training set used in classification in order to reduce its size. It would ideally successively add the instance-label pair that decreases the classification error most. However, the effect of the addition of a pair is not known in advance. It can still be estimated with the pairs already in the training set. The online minimizatio...

Optimism in Active Learning with Gaussian Processes

In the context of Active Learning for classification, the classification error depends on the joint distribution of samples and their labels which is initially unknown. The minimization of this error requires estimating this distribution. Online estimation of this distribution involves a trade-off between exploration and exploitation. This is a common problem in machine learning for which multi...

Learning-Based Energy Management System for Scheduling of Appliances inside Smart Homes

Improper designs of the demand response programs can lead to numerous problems such as customer dissatisfaction and lower participation in these programs. In this paper, a home energy management system is designed which schedules appliances of smart homes based on the user’s specific behavior to address these issues. Two types of demand response programs are proposed for each house which are sh...

Detection of Glioblastoma Multiforme Tumor in Magnetic Resonance Spectroscopy Based on Support Vector Machine

Introduction: The brain tumor is an abnormal growth of tissue in the brain, which is one of the most important challenges in neurology. Brain tumors have different types. Some brain tumors are benign and some brain tumors are cancerous and malignant. Glioblastoma Multiforme (GBM) is the most common and deadliest malignant brain tumor in adults. The average survival rate for peo...

Over-optimism in bioinformatics research

The problem of "false research findings" in medical research has focused much attention in the last few years (Ioannidis, 2005). One of the main problems, termed as "fishing for significance" in the present letter, is that researchers often (consciously or subconsciously) report results that are in fact the product of an intensive optimization, i.e. of multiple comparisons. Such results are typ...

Journal title:

Volume 12, Issue

Pages -

Publication date: 2017
